
Test-Time Collective Prediction

Neural Information Processing Systems

An increasingly common setting in machine learning involves multiple parties, each with their own data, who want to jointly make predictions on future test points. Agents wish to benefit from the collective expertise of the full set of agents to make better predictions than they would individually, but may not be willing to release labeled data or model parameters.



AFP developing AI tool to decode gen Z slang amid warning about 'crimefluencers' hunting girls

The Guardian

Federal police say they have identified 59 alleged offenders as being in these online networks and have made an unspecified number of arrests. Australian federal police will develop an AI tool to decode gen Z and gen Alpha slang and emojis in an effort to crack down on sadistic online exploitation and "crimefluencers". The AFP commissioner, Krissy Barrett, used a speech at the National Press Club on Wednesday to warn of the rise of online crime networks of young boys and men who are targeting vulnerable teen and preteen girls. The newly appointed chief outlined how the perpetrators, who are overwhelmingly from English-speaking backgrounds, were grooming victims and then forcing them to "perform serious acts of violence on themselves, their siblings, others or their pets".


DiNo and RanBu: Lightweight Predictions from Shallow Random Forests

Santos, Tiago Mendonça dos, Izbicki, Rafael, Esteves, Luís Gustavo

arXiv.org Machine Learning

Random Forest ensembles are a strong baseline for tabular prediction tasks, but their reliance on hundreds of deep trees often results in high inference latency and memory demands, limiting deployment in latency-sensitive or resource-constrained environments. We introduce DiNo (Distance with Nodes) and RanBu (Random Bushes), two shallow-forest methods that convert a small set of depth-limited trees into efficient, distance-weighted predictors. DiNo measures cophenetic distances via the most recent common ancestor of observation pairs, while RanBu applies kernel smoothing to Breiman's classical proximity measure. Both approaches operate entirely after forest training: no additional trees are grown, and tuning of the single bandwidth parameter $h$ requires only lightweight matrix-vector operations. Across three synthetic benchmarks and 25 public datasets, RanBu matches or exceeds the accuracy of full-depth random forests, particularly in high-noise settings, while reducing training-plus-inference time by up to 95%. DiNo achieves the best bias-variance trade-off in low-noise regimes at a modest computational cost. Both methods extend directly to quantile regression, maintaining accuracy with substantial speed gains. The implementation is available as an open-source R/C++ package at https://github.com/tiagomendonca/dirf. We focus on structured tabular random samples (i.i.d.), leaving extensions to other modalities for future work.
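The RanBu-style idea described above (kernel smoothing over Breiman's proximity from a shallow forest) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the choice of a Gaussian kernel, the bandwidth value, and all variable names are assumptions made here for clarity.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy data: 150 training points, 50 test points
X, y = make_regression(n_samples=200, n_features=5, noise=1.0, random_state=0)
X_train, y_train, X_test = X[:150], y[:150], X[150:]

# Shallow, depth-limited forest, as in the paper's setting
rf = RandomForestRegressor(n_estimators=25, max_depth=3, random_state=0)
rf.fit(X_train, y_train)

# Breiman's proximity: fraction of trees in which two points share a leaf
leaves_train = rf.apply(X_train)   # shape (n_train, n_trees), leaf index per tree
leaves_test = rf.apply(X_test)     # shape (n_test, n_trees)
prox = (leaves_test[:, None, :] == leaves_train[None, :, :]).mean(axis=2)

# Kernel smoothing of the proximity-derived distance with bandwidth h
# (Gaussian kernel assumed here; the paper only specifies a single bandwidth h)
h = 0.5
w = np.exp(-((1.0 - prox) / h) ** 2)
w /= w.sum(axis=1, keepdims=True)  # normalize weights per test point

# Distance-weighted prediction: one matrix-vector product, no new trees grown
y_pred = w @ y_train
print(y_pred.shape)  # (50,)
```

Note that everything after `rf.fit` operates purely on leaf indices and matrix-vector products, which is why re-tuning `h` is cheap: the forest is never retrained.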




Approximately Unimodal Likelihood Models for Ordinal Regression

Yamasaki, Ryoya

arXiv.org Machine Learning

Ordinal regression (OR, also called ordinal classification) is the classification of ordinal data, in which the underlying target variable is categorical and is considered to have a natural ordinal relation with respect to the underlying explanatory variable. A key to successful OR models is to identify a data structure, such as this natural ordinal relation, that is common to many ordinal datasets and to reflect that structure in the design of the models. A recent OR study found that for many real-world ordinal datasets, the conditional probability distribution (CPD) of the target variable given a value of the explanatory variable tends to be unimodal. Several previous studies have therefore developed unimodal likelihood models, in which the predicted CPD is guaranteed to be unimodal. However, it has also been observed experimentally that many real-world ordinal datasets contain values of the explanatory variable at which the underlying CPD is non-unimodal, so unimodal likelihood models may suffer from a bias for such CPDs. Motivated by the goal of mitigating this bias, we propose approximately unimodal likelihood models, which can represent both unimodal CPDs and CPDs that are close to unimodal. We verify experimentally that the proposed models are effective for statistical modeling of ordinal data and for OR tasks.
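The unimodality property at the center of this abstract is easy to state concretely: a CPD over ordered labels is unimodal when its probabilities are non-decreasing up to the mode and non-increasing afterward. The following sketch checks that property; the helper `is_unimodal` is an illustrative function written here, not part of the paper.

```python
import numpy as np

def is_unimodal(p, tol=1e-12):
    """Check whether a probability vector over ordered labels is unimodal:
    non-decreasing up to its mode, non-increasing after it."""
    p = np.asarray(p, dtype=float)
    m = int(np.argmax(p))
    rising = np.all(np.diff(p[: m + 1]) >= -tol)
    falling = np.all(np.diff(p[m:]) <= tol)
    return bool(rising and falling)

# A unimodal CPD over 5 ordinal labels: mass rises to label 2, then falls
print(is_unimodal([0.05, 0.15, 0.45, 0.25, 0.10]))  # True

# A bimodal CPD, which a strictly unimodal likelihood model cannot represent
# without bias; the proposed models target CPDs like (or close to) this
print(is_unimodal([0.40, 0.10, 0.05, 0.10, 0.35]))  # False
```

A strictly unimodal likelihood forces every predicted CPD to pass this check, which is exactly the constraint the proposed approximately unimodal models relax.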